Abstract. The classical preferential attachment model is sensitive to the choice of the initial configuration 
of the network. As the number of initial nodes and their degree grow, so does the time needed for an 
equilibrium degree distribution to be established. We study this phenomenon, provide estimates of the 
equilibration time, and characterize the degree distribution cutoff observed at finite times. When the initial 
network is dense and exceeds a certain small size, there is no equilibration and a suitable statistical test can 
always discern the produced degree distribution from the equilibrium one. As a by-product, the weighted 
Kolmogorov-Smirnov statistic is demonstrated to be more suitable for statistical analysis of power-law 
distributions with cutoff when the data is ample. 

PACS. 64.60.aq Networks - 89.75.Hc Networks and genealogical trees - 01.75.+m Science and society 



1 Introduction 

The preferential attachment (PA) model proposed by Ba- 
rabasi and Albert is a network growth model where new 
nodes gradually appear and connect to existing nodes with 
probability proportional to the target node's degree [1] 
(other frequently-used synonyms for this mechanism are 
rich-get-richer and cumulative advantage). Although not 
the first of its kind [3] , PA became popular for its simplic- 
ity and for producing a stationary power-law degree dis- 
tribution which makes it a good candidate for modeling a 
wide range of real systems where heavy-tailed degree dis- 
tributions are often observed O Ch. 3] . The model helped 
to initiate the young field of complex networks OHHS] and 
it has been subsequently much studied and generalized 
(see in particular [6, Ch. 8] for an overview of analytical 
approaches to its solution and generalizations). 

Significant evidence for preferential attachment has
been found in various real datasets [TUHlEl but some important
deviations have been reported too [10,11], mainly
in relation to the strong time bias of the model: high-degree
nodes (the heavy tail) are almost exclusively those that were
introduced in the early stage of the network's evolution.
In the original PA model, if the network growth starts with
two connected nodes (a so-called dyadic initial condition) and
every new node creates one link, a node introduced at time
step $i$ has at time $t$ the expected degree $\sqrt{t/i}$, which
decreases fast with $i$. (Since the distribution of nodes is
uniform in $i$, this relation can be used to derive the
well-known $1/k^3$ degree distribution in an especially
simple way.) The drawback of time bias



matus.medo@unifr.ch 



has been eliminated only recently by a model [12] where
aging of nodes makes it possible also for late introduced
nodes to gain a significant number of links. Various other
models of growing networks with aging of nodes exist and
differ in their scope and behavior [13,14].
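
The "especially simple" derivation of the $1/k^3$ tail mentioned above can be sketched as follows. This is our own reading of the remark, under the assumption that each node's degree is replaced by its expected value $k(i) = \sqrt{t/i}$ and that the birth time $i$ is uniform on $(0, t]$:

```latex
\begin{align}
k(i) > k \;&\Longleftrightarrow\; i < \frac{t}{k^2}, \\
P(K > k) &= \frac{1}{t}\cdot\frac{t}{k^2} = k^{-2}, \\
p(k) &= -\frac{d}{dk}\,P(K > k) = \frac{2}{k^3}.
\end{align}
```

The exponent agrees with the stationary distribution $4/[k(k+1)(k+2)] \sim 4/k^3$; the prefactor differs because fluctuations around the expected degree are neglected in this shortcut.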

As networks rarely grow from a single starting node, 
we investigate the influence of an initial network of nodes 
on the original PA model. How is the stationary degree dis- 
tribution formed and what is its functional form? To this 
end, we first show that if the degree of nodes in the initial 
network is greater than a certain threshold value (which 
we find to be approximately 3), the initial nodes do not 
become part of the eventual power-law degree distribution 
of the network. To assess how the degree
distribution of newly added nodes approaches a power-law form,
we propose three quantities of interest and study their
evolution with time. This leads to estimates of the distribution's
equilibration time which are then interpreted in
the context of the quantities used to obtain them.

When performing goodness-of-fit tests of the network
degree distributions, we find a divergence between results
obtained with the Kolmogorov-Smirnov statistic used for
statistical tests of power-law distributions in [15] and those
obtained with the weighted Kolmogorov-Smirnov statistic
introduced in [16]. We show that this difference is due to
a cutoff of the network degree distributions and investigate
the behavior and shape of this cutoff under various
conditions. Our results reveal high sensitivity of the PA
model to the initial network configuration which, to the
best of our knowledge, has not been reported previously.
Furthermore, significant differences exist between the ability
of the standard and weighted Kolmogorov-Smirnov statistics
to detect a power-law cutoff in empirical data. Note
that finite-size effects and sensitivity to the initial condition
in the PA model have already been studied in [17],
where, however, no results were provided for the equilibration
time and the degree distribution cutoff.



2 PA model with multiple initial nodes 

We study the PA model starting with an initial random
network of $n_0$ nodes with mean degree $\mu_0$ where in every
time step one node is added and creates a link to an existing
node selected according to preferential attachment.
The network thus consists of $n_0 + t$ nodes after time step $t$.
For the sake of clarity, nodes constituting the initial network
are referred to as initial nodes while all gradually
added nodes are referred to as new nodes.
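
The growth rule just described can be sketched in a few lines. The code below is a minimal illustration, not the simulation code used for the reported results; it assumes a complete initial network ($\mu_0 = n_0 - 1$, with $n_0 \ge 2$), and the function name `simulate_pa` and the stub-list trick are our own choices.

```python
import random

def simulate_pa(n0, t, seed=0):
    """Grow a PA network from a complete initial network of n0 nodes.

    Returns the list of node degrees; entries 0..n0-1 are the initial
    nodes, the remaining entries are the new nodes in order of arrival.
    """
    rng = random.Random(seed)
    degree = [n0 - 1] * n0            # complete graph: mu0 = n0 - 1 (assumption)
    # Every node appears in `stubs` once per link end, so a uniform draw
    # from the list selects a node with probability proportional to degree.
    stubs = [v for v in range(n0) for _ in range(n0 - 1)]
    for step in range(t):
        target = rng.choice(stubs)
        new = n0 + step
        degree.append(1)              # the new node creates exactly one link
        degree[target] += 1
        stubs.append(target)
        stubs.append(new)
    return degree

degree = simulate_pa(n0=10, t=20000)
initial, new = degree[:10], degree[10:]
# total degree, mean degree of initial nodes, mean degree of new nodes
print(sum(degree), sum(initial) / len(initial), sum(new) / len(new))
```

The total degree printed first should equal $n_0\mu_0 + 2t$, which is a quick consistency check on any implementation of the model.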

The degree distribution of the initial nodes, $p_{k,t}$, can
be studied by the standard master-equation approach [18].
Denoting the mean degree of the initial nodes at time
$t$ as $\mu_t$, PA dictates that a link created at time step $t$
connects to one of the initial nodes with the probability
$Q_t = n_0\mu_t/(n_0\mu_0 + 2t)$ where $n_0\mu_0 + 2t$ is the total degree
of all nodes at time $t$. The master equation for $p_{k,t}$ follows
in the form



$$p_{k,t+1} - p_{k,t} = \frac{(k-1)\,p_{k-1,t} - k\,p_{k,t}}{n_0\mu_0 + 2t}. \qquad (1)$$
By multiplying this with $k$ or $k^2$ and summing over all
$k$, we obtain a difference equation for $\langle k(t)\rangle$ or $\langle k(t)^2\rangle$,
respectively. A continuous time approximation then yields
the average degree of the initial nodes, $\mu_t := \langle k(t)\rangle$, and
the variance of their degree, $\sigma_t^2 := \langle k(t)^2\rangle - \langle k(t)\rangle^2$,
in the form
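
As a sketch of the steps just described (our own derivation under the continuous-time approximation, with $\sigma_0^2$ denoting the initial degree variance, a symbol introduced here for illustration), the moment equations and their solutions read:

```latex
\begin{align}
\mu_{t+1} - \mu_t
  &= \frac{\sum_k k\,[(k-1)\,p_{k-1,t} - k\,p_{k,t}]}{n_0\mu_0 + 2t}
   = \frac{\mu_t}{n_0\mu_0 + 2t}, \\
\frac{d\mu_t}{dt} \approx \frac{\mu_t}{n_0\mu_0 + 2t}
  \quad&\Longrightarrow\quad
  \mu_t = \mu_0\sqrt{1 + \frac{2t}{n_0\mu_0}}, \\
\frac{d\langle k^2\rangle}{dt} \approx \frac{2\langle k^2\rangle + \mu_t}{n_0\mu_0 + 2t}
  \quad&\Longrightarrow\quad
  \sigma_t^2 = (\sigma_0^2 + \mu_0)\left(1 + \frac{2t}{n_0\mu_0}\right) - \mu_t .
\end{align}
```

If all initial nodes start with the same degree ($\sigma_0 = 0$, an assumption of this sketch), the variance simplifies to $\sigma_t^2 = \mu_t(\mu_t - \mu_0)/\mu_0$.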
2.1 Separation of the initial nodes 

We now examine whether the well-known stationary de- 
gree distribution of the original PA model 



$$f(k) = \frac{4}{k(k+1)(k+2)} \qquad (3)$$



can form in the presence of the initial nodes. To do that,
we compare the number of the initial nodes with degree
$\mu_t$ and the number of new nodes with this degree which,
according to Eq. (3), is $4t/[\mu_t(\mu_t + 1)(\mu_t + 2)]$. If the former
number is greater than the latter, the contribution of the
initial nodes significantly distorts the expected form of
$f(k)$ given above. Assuming that the degree distribution
of the initial nodes is approximately Gaussian, there are
roughly $n_0/\sqrt{2\pi\sigma_t^2}$ of them with degree $\mu_t$. The initial
nodes thus separate from the equilibrium degree distribution
$f(k)$ when

$$\frac{n_0}{\sqrt{2\pi\sigma_t^2}} > \frac{4t}{\mu_t(\mu_t + 1)(\mu_t + 2)}. \qquad (4)$$
Fig. 1. Number of nodes of degree $k$, $n(k)$, for simulated PA
networks with $t = 2 \cdot 10^5$ added nodes: blue circles and red
diamonds correspond to the new and initial nodes, respectively.
Vertical lines mark the maximal degree of the new nodes (blue,
solid) and the mean degree of the initial nodes (red, dashed).
Separation of the initial nodes does not occur for $\mu_0 = 3$ (top)
but it is clearly visible for two distinct choices of $n_0$ and $\mu_0$
where $\mu_0 \gg 3$ (bottom).



Letting $t \to \infty$, we find that this inequality is always
fulfilled for

$$\mu_0 > 2\sqrt[3]{\pi} \approx 3. \qquad (5)$$

Hence regardless of the initial network size $n_0$ and the
number of the new nodes $t$, the degree distribution of the
initial nodes separates from that of the new nodes as long
as $\mu_0 \gtrsim 3$. Figure 1 shows cases of merging and separation
of the degree distributions for various values of $n_0$ and $\mu_0$.
It confirms that when the condition of Eq. (5) is met, the initial
nodes remain well separated and visible in the degree distribution
regardless of the values of $n_0$ and $t$. From now
on, we thus focus on the degree distribution of the new
nodes only and verify whether at least this can take the
expected power-law form and when that happens.
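
The separation criterion can be checked numerically. The sketch below compares the estimated number of initial nodes at degree $\mu_t$ with the number of new nodes expected at that degree. It assumes that all initial nodes start with the same degree (zero initial variance) and uses the continuous-time estimates $\mu_t = \mu_0\sqrt{1 + 2t/(n_0\mu_0)}$ and $\sigma_t^2 = \mu_t(\mu_t - \mu_0)/\mu_0$; these expressions and the function name are our own assumptions, not quoted from the text.

```python
import math

def separation_check(n0, mu0, t):
    """Return (initial nodes at degree mu_t, new nodes at degree mu_t).

    Assumes zero initial degree variance and t > 0 (sketch only).
    """
    mu_t = mu0 * math.sqrt(1.0 + 2.0 * t / (n0 * mu0))
    var_t = mu_t * (mu_t - mu0) / mu0              # continuous-time estimate
    initial_at_mu = n0 / math.sqrt(2.0 * math.pi * var_t)
    new_at_mu = 4.0 * t / (mu_t * (mu_t + 1.0) * (mu_t + 2.0))
    return initial_at_mu, new_at_mu

threshold = 2.0 * math.pi ** (1.0 / 3.0)           # long-time threshold on mu0
print(round(threshold, 3))                          # 2.929
for mu0 in (2, 3, 10):
    lhs, rhs = separation_check(n0=100, mu0=mu0, t=10**6)
    print(mu0, lhs > rhs)                           # True means separation
```

Under these assumptions the long-time threshold follows from balancing the two sides, both of which scale as $t^{-1/2}$.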



3 Equilibration time 

The degree distribution of the new nodes can be obtained from
the master equation in the large-time limit. Despite the influence
of the initial nodes at the beginning of the network's
growth, the resulting distribution can be shown to be of
the same form as for the original PA model, see Eq. (3).
To assess the time needed to achieve this equilibrium distribution,
we employ three different approaches. For the
sake of simplicity, we assume a complete initial network
in this section, i.e., $\mu_0 = n_0 - 1$.



3.1 Mean degree of the new nodes 

In the early stage of the network's evolution, links from the 
new nodes frequently attach to the initial nodes. 



Y. Berset and M. Medo: The effect of the initial network configuration on preferential attachment 






This causes the mean degree of the new nodes to be considerably
lower than the overall mean degree which is two.
Denoting the mean degree of the new nodes at time $t$ as
$M_t$, the total degree of all nodes in the network, $n_0\mu_0 + 2t$,
can be expressed as $tM_t + n_0\mu_t$. We can therefore use the
previously obtained result for $\mu_t$ to obtain

$$M_t = 2 - \frac{n_0(\mu_t - \mu_0)}{t} \qquad (6)$$

which has the long-time limit $M_\infty = 2$. To characterize
the equilibration, we compute the time needed to reach
$M_t = (1 - \varepsilon)M_\infty$ which follows in the form

$$t_{\mathrm{eq}} = \frac{(1 - 2\varepsilon)\,n_0^2}{2\varepsilon^2} + O(n_0) \approx \frac{n_0^2}{2\varepsilon^2} \qquad (7)$$

for large $n_0$ and small $\varepsilon$. The equilibration time given by
the mean degree of the new nodes thus grows quadratically
with $n_0$. It is straightforward to verify that in the case of
a general initial network with $\mu_0 < n_0 - 1$, this result
changes to $t_{\mathrm{eq}} \approx n_0\mu_0/(2\varepsilon^2)$.
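
The equilibration time based on the mean degree can be verified with a short calculation. The sketch below assumes the continuous-time form $\mu_t = \mu_0\sqrt{1 + 2t/(n_0\mu_0)}$ for the mean degree of the initial nodes (our assumption), for which the condition $M_t = 2(1 - \varepsilon)$ has the closed-form solution $t = (1 - 2\varepsilon)\,n_0\mu_0/(2\varepsilon^2)$. Function names are illustrative.

```python
import math

def mean_degree_new(n0, mu0, t):
    """Mean degree M_t of the new nodes from the total-degree balance."""
    mu_t = mu0 * math.sqrt(1.0 + 2.0 * t / (n0 * mu0))
    return 2.0 - n0 * (mu_t - mu0) / t

def equilibration_time(n0, mu0, eps):
    """Time at which M_t reaches (1 - eps) * 2 in this approximation."""
    return (1.0 - 2.0 * eps) * n0 * mu0 / (2.0 * eps ** 2)

n0, eps = 30, 0.05
t_eq = equilibration_time(n0, n0 - 1, eps)     # complete initial network
print(t_eq, mean_degree_new(n0, n0 - 1, t_eq))
```

For a complete initial network the product $n_0\mu_0 = n_0(n_0 - 1)$ gives the quadratic growth with $n_0$ stated above.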

3.2 Maximal degree of the new nodes 

We now consider the highest degree of a new node as
an equilibration criterion. When the maximal degree observed
in numerical simulations reaches the theoretically
expected value following from the stationary distribution
given by Eq. (3), we say that the degree distribution has
equilibrated.

To compute the expected maximum degree value $\langle k_m\rangle$,
we study the extreme statistics for $t$ draws from the equilibrium
distribution $f(k)$. Following the steps described
in [19], the probability that the highest degree value is $k_m$
has the form



p(k m ) = tf(k m ) 

Approximating (1 — x) l ~ l ~ 
it 
This sum is easy to compute numerically but one can
also estimate its value by roughly approximating the exponential
term $e^{-ax}$ with one for $x \in [0, 1/a]$ and zero for
$x \in (1/a, \infty)$. This yields the expected value

$$\langle k_m\rangle \approx \sqrt{8t}. \qquad (10)$$



A comparison of this result with a numerical summation of
Eq. (9) shows that when $t$ is large, the true value of $\langle k_m\rangle$ is
overestimated by less than 15%. While the average value
of $k_m$ following from simulations, $\bar{k}_m$, is also proportional
to $\sqrt{t}$, it always holds that $\langle k_m\rangle > \bar{k}_m$ and the gap between
the two quantities grows with the number of the
initial nodes $n_0$ (see Figure 2). We can conclude that no
equilibration time can be defined here and the extreme degree
statistics suggests that the degree distribution of the
new nodes never reaches the stationary form prescribed
by Eq. (3).
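
The comparison between the rough estimate $\sqrt{8t}$ and a numerical evaluation can be reproduced approximately. The sketch below computes the expected maximum of $t$ independent draws from $f(k)$ via $\langle k_m\rangle = \sum_{k \ge 0} P(\max > k)$, using $P(K > k) = 2/[(k+1)(k+2)]$. Treating the draws as independent, the truncation point `k_cut`, and the linearized tail correction are assumptions of this sketch; it is not necessarily the same summation as the one used in the text.

```python
import math

def expected_max_degree(t, k_cut=None):
    """Expected maximum of t independent draws from f(k) = 4/(k(k+1)(k+2))."""
    if k_cut is None:
        k_cut = int(200 * math.sqrt(t))
    total = 1.0                                    # k = 0 term: P(max > 0) = 1
    for k in range(1, k_cut):
        tail = 2.0 / ((k + 1) * (k + 2))           # P(K > k)
        # 1 - (1 - tail)^t, evaluated in a numerically stable way
        total += -math.expm1(t * math.log1p(-tail))
    total += 2.0 * t / (k_cut + 1)                 # linearized remainder of the sum
    return total

t = 10**4
exact, rough = expected_max_degree(t), math.sqrt(8 * t)
print(exact, rough, rough / exact)
```

For large $t$ the numerical value approaches $\sqrt{2\pi t}$, so the ratio printed last tends to $\sqrt{8/(2\pi)} \approx 1.13$, in line with the overestimate of less than 15% quoted above.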




Fig. 2. Analytical results for the mean maximal degree
(shown with the dashed line) and simulation results for the
mean maximal degree at various values of $n_0$ (assuming a complete
initial network, i.e., $\mu_0 = n_0 - 1$). Results are averaged
over 1000 network realizations.



3.3 Fitting the network degree distributions 

We finally study the agreement between the functional forms
of the simulated and the equilibrium degree distributions.
The standard approach to this task is a so-called
goodness-of-fit test. Given a set of observed data
and an expected statistical distribution, it measures how
much the data deviates from the expected distribution
compared to artificial data drawn from this distribution.
In particular, we adopt a procedure presented in [15]
specifically for statistical analysis of power-law distributions,
which goes as follows. For an input realization of the network
at time $t$ (i.e., after adding $t$ new nodes), one computes
the cumulative degree distribution of the new nodes,
$R(k)$, and the cumulative degree distribution of the expected
distribution, $T(k) := \sum_{k' \ge k} f(k')$. The Kolmogorov-Smirnov
statistic (KS) defines the distance between
the two cumulative distributions as

$$D_0 = \max_k |T(k) - R(k)|. \qquad (11)$$



One then generates a large number of artificial datasets
following the expected distribution and having the same
size as the input data and computes the Kolmogorov-Smirnov
statistic $D_1$ for them. The fraction of datasets
with $D_1 > D_0$ then gives the $p$-value of the fit between the
input degree distribution $R(k)$ and the expected degree
distribution. By averaging this result over various realizations
of the network, we obtain the final $p$-values which are
reported here. We significantly speed up the computation
by using the same set of artificial datasets and their $D_1$
values to evaluate each network realization at a given time
$t$. The hypothesis that the network degree data follows
the expected distribution Eq. (3) is then evaluated on the
basis of the resulting $p$-value. If $p < 0.1$, the hypothesis
is rejected. In other words, the hypothesis of agreement
is plausible as long as at least 10% of the artificial
datasets agree less with the expected distribution than the
simulated network data do. The same procedure can be carried
out using the Anderson-Darling statistic [15] (which
is referred to as weighted Kolmogorov-Smirnov statistic
Fig. 3. Scaling of the $p$-value-based equilibration time $t_{\mathrm{eq}}$ with
the number of initial nodes $n_0$ for the complete initial network
($\mu_0 = n_0 - 1$) and the initial network with fixed degree ($\mu_0 = 9$).
Numerical results and corresponding linear fits are shown with
symbols and dashed lines, respectively. The dashed lines have
slopes 4.18 and 2.13, respectively.



(WKS) in [16])

$$D^* = \max_k \frac{|T(k) - R(k)|}{\sqrt{T(k)\,[1 - T(k)]}}. \qquad (12)$$

The corresponding $p$-value is denoted $p^*$.
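
The goodness-of-fit procedure with both statistics can be sketched as follows. This is a minimal illustration rather than the code behind the reported results: it uses $T(k) = 2/[k(k+1)]$ for the stationary $f(k)$, draws artificial data by inverse-transform sampling, and skips $k = 1$ in the weighted statistic because $T(1) = 1$ makes the weight singular. The function names, the number of artificial datasets, and the crude degree cap used as a stand-in for a cutoff are all assumptions made for the example.

```python
import math
import random

def tail(k):
    """T(k) = sum over k' >= k of f(k') for f(k) = 4/(k(k+1)(k+2))."""
    return 2.0 / (k * (k + 1))

def sample_degrees(n, rng):
    """Inverse-transform sampling such that P(K >= k) = tail(k)."""
    out = []
    for _ in range(n):
        u = 1.0 - rng.random()                      # u in (0, 1]
        out.append(int((math.sqrt(1.0 + 8.0 / u) - 1.0) / 2.0))
    return out

def ks_statistics(degrees):
    """Return (D0, D*) comparing the sample with the stationary f(k)."""
    n = len(degrees)
    counts = {}
    for k in degrees:
        counts[k] = counts.get(k, 0) + 1
    d0 = d_star = 0.0
    at_least = n                                    # number of samples >= k
    for k in range(1, max(degrees) + 2):
        r, t_k = at_least / n, tail(k)
        diff = abs(t_k - r)
        d0 = max(d0, diff)
        if k >= 2:                                  # weight is singular at T(1) = 1
            d_star = max(d_star, diff / math.sqrt(t_k * (1.0 - t_k)))
        at_least -= counts.get(k, 0)
    return d0, d_star

def p_values(degrees, n_synthetic=200, seed=0):
    """Fractions of artificial datasets with larger D0 and larger D*."""
    rng = random.Random(seed)
    d0, d_star = ks_statistics(degrees)
    worse = worse_star = 0
    for _ in range(n_synthetic):
        s0, s_star = ks_statistics(sample_degrees(len(degrees), rng))
        worse += s0 > d0
        worse_star += s_star > d_star
    return worse / n_synthetic, worse_star / n_synthetic

rng = random.Random(1)
clean = sample_degrees(5000, rng)
capped = [min(k, 20) for k in clean]                # crude stand-in for a cutoff
print(ks_statistics(clean), ks_statistics(capped))
```

Calling `p_values(capped)` and `p_values(clean)` then gives the pair $(p, p^*)$ for each dataset, which can be compared against the 0.1 threshold.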

Equilibration time can be defined based on when $p$
reaches the threshold value of 0.1 and the stationary distribution
$f(k)$ therefore becomes a plausible hypothesis for
simulated networks. Figure 3 shows that $t_{\mathrm{eq}}$ scales with $n_0$
as $t_{\mathrm{eq}} \sim n_0^\beta$ where $\beta = 4.18 \pm 0.02$ for $\mu_0 = n_0 - 1$ (applies
for $n_0 > 30$) and $\beta = 2.13 \pm 0.01$ for constant $\mu_0$ (applies
for $n_0 > 40$). Note that, as before, the scaling exponent
for complete initial networks is about twice
as high as for initial networks with fixed $\mu_0$. What is different
from equilibration based on the average degree of
the new nodes is that for both fixed and growing $\mu_0$, we
observe much faster growth of $t_{\mathrm{eq}}$ with $n_0$.

Very recently, a new goodness-of-fit test has been proposed
[20] which also relies on the KS statistic but circumvents
the $p$-value testing. This approach is distribution-free
and focuses only on whether the KS statistic of a data
set is higher than a certain threshold value. In particular,
the hypothesis that the given data follows a power-law distribution
can be discarded with 90% confidence when its
KS statistic $D_0$ is higher than $1.224/\sqrt{t}$ for data size $t$ (for
the 95% confidence level, the threshold would be $1.358/\sqrt{t}$ as
reported in [20]). Besides saving computational time (no
artificial data sets need to be generated here), this method
provides scaling exponents $\beta$ that match well with the ones
derived above. We can conclude that a very long time is
needed to achieve network degree distributions that are
accepted to be compatible with the equilibrium degree
distribution by the standard Kolmogorov-Smirnov test.
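
The threshold form of the test is simple enough to state as code. The sketch below uses only the two coefficients quoted above; the function names are our own.

```python
import math

# Critical coefficients quoted in the text for the distribution-free test.
KS_COEFFICIENT = {0.90: 1.224, 0.95: 1.358}

def ks_threshold(t, confidence=0.90):
    """Critical value of the KS statistic for data size t."""
    return KS_COEFFICIENT[confidence] / math.sqrt(t)

def rejects_power_law(d0, t, confidence=0.90):
    """True when the measured D0 exceeds the critical value."""
    return d0 > ks_threshold(t, confidence)

print(ks_threshold(10**6), ks_threshold(10**6, 0.95))
```

Because the critical value shrinks as $1/\sqrt{t}$, a fixed deviation between $R(k)$ and $T(k)$ is eventually rejected as the data size grows.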

While $p$-values follow the expected scenario and grow
with $t$, thus allowing a new equilibration time to be introduced,
simulations show that $p^*$-values based on the
WKS are essentially independent of $t$. As soon as $\mu_0 > 10$,
$p^* < 0.1$ for any value of $t$ (except for very low $t$ where


Fig. 4. The cumulative degree distribution of one network
realization for $n_0 = 30$, $\mu_0 = 29$, and $t = 10^6$ (dashed line)
and the stationary distribution (solid line). In this case, the
two variants of the goodness-of-fit test provide contradictory
values $p = 0.30$ and $p^* = 0.01$. Fits of the degree distribution
with exponential and normal cutoff (see Section 3.4) are also
shown here (the corresponding $p$ and $p^*$ values, averaged over
multiple network realizations, can be found in Figure 5).



however high values of $p^*$ are due to fluctuations of the
tiny evaluated data); see the corresponding lines in Figure 5.
To understand what causes this behavior, it is instructive
to plot the cumulative network degree distribution
and compare it with the stationary distribution. This
is shown in Figure 4 where one can see that the tails of
these two distributions differ substantially, with the network
degree distribution showing a cutoff for degrees greater
than approximately 30. Note that this cutoff is exactly the
reason why the observed $k_m$ values reported in Figure 2
are lower than expected. Despite the difference in CDFs,
the goodness-of-fit test leads to a threshold-satisfying $p$-value
of 0.25 which suggests a high degree of agreement according
to the KS statistic. The KS statistic fails to detect the
deviation between the distributions because it is based
only on the differences between CDFs, which are inevitably
small in the tail (the distance $|T(k) - R(k)|$ cannot exceed
$\max\{T(k), R(k)\}$). By contrast, the WKS is weighted by
$1/\sqrt{T(k)[1 - T(k)]}$ which makes it more sensitive to CDF
differences that occur in the tail where $T(k)$ is small and
allows it to reject the hypothesis of the network degree
distribution being compatible with the stationary distribution
with $p^* \approx 0.01$. Note that the approach proposed
in [20] can be applied also to the WKS and again agrees
with the findings presented here.



3.4 Cutoff fitting 

Given the sensitivity of the WKS to the tail behavior, it is
now natural to use it to study the cutoff type and position
as a function of $n_0$ and $t$. In addition to the usual exponential
cutoff which is often seen in real data [15], we also test
a so-called normal cutoff of the form $\exp[-(k/\lambda)^2]$
which is a special case of the stretched exponential function
(sometimes referred to as a compressed exponential
function because it decays faster than exponentially).
(a) $n_0 = 10$, $\mu_0 = 9$
(b) $n_0 = 30$, $\mu_0 = 29$
(c) $n_0 = 100$, $\mu_0 = 99$

Fig. 5. $p$-values, $p^*$-values, and cutoff values $\lambda$ versus $t$ for $n_0 = 10$, $\mu_0 = 9$ (top row), $n_0 = 30$, $\mu_0 = 29$ (middle row), and
$n_0 = 100$, $\mu_0 = 99$ (bottom row) with different symbols corresponding to the fitting of different degree distributions: stationary
degree distribution of the PA model $f(k)$ (red circles), $f(k)$ with exponential cutoff (green squares), and $f(k)$ with normal cutoff
(blue diamonds). Horizontal dashed lines mark the threshold $p$-value of 0.1. Results are averaged over 100 network realizations,
each of which is compared with 1000 draws from the reference distribution. Thick solid lines in the graphs of $\lambda$ serve as guides
to the eye and all have slope 0.5.



This choice is further supported by the likelihood of the network
degree data: when the cutoff term is assumed in the
form $\exp[-(k/\lambda)^\gamma]$, the likelihood of the data reaches its maximum
for $\gamma$ between 1.5 and 2.5 (as $t$ grows, the maximum
shifts to higher values). We thus have two candidate distributions

$$f_e(k) = \frac{A(\lambda_e)\,e^{-k/\lambda_e}}{k(k+1)(k+2)}, \qquad (13)$$

$$f_n(k) = \frac{B(\lambda_n)\,e^{-(k/\lambda_n)^2}}{k(k+1)(k+2)} \qquad (14)$$



where $A(\lambda_e)$ and $B(\lambda_n)$ are normalization factors. The
procedure is now as follows. For a particular network realization,
one chooses the cutoff parameter that maximizes
the likelihood of the data (taking only the new nodes into account);
the $p$- and $p^*$-values are then computed with respect to
Eqs. (13) and (14) as reference distributions. By averaging
over various network realizations, we obtain statistics
for $\lambda_e$ and $\lambda_n$ as well as average values of $p$ and $p^*$ which
measure the goodness of fit.
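
The likelihood maximization for the normal cutoff can be sketched as below. This is an illustrative implementation only: the grid search, the truncation `K_MAX` of the normalization sum, and the rejection sampler used to generate test data are our own choices, not the procedure used for the reported results.

```python
import math
import random

K_MAX = 2000   # truncation of the normalization sum (assumption of this sketch)

def log_norm(lam):
    """log of sum_k exp(-(k/lam)^2) / (k(k+1)(k+2)), truncated at K_MAX."""
    return math.log(sum(math.exp(-(k / lam) ** 2) / (k * (k + 1) * (k + 2))
                        for k in range(1, K_MAX + 1)))

def fit_normal_cutoff(degrees, grid):
    """Grid-search maximum-likelihood estimate of the cutoff parameter."""
    n, sum_k2 = len(degrees), sum(k * k for k in degrees)
    best_lam, best_ll = None, -math.inf
    for lam in grid:
        # lambda-dependent part of the log-likelihood of the normal-cutoff form
        ll = -sum_k2 / lam ** 2 - n * log_norm(lam)
        if ll > best_ll:
            best_lam, best_ll = lam, ll
    return best_lam

def sample_normal_cutoff(n, lam, rng):
    """Rejection sampling: propose from f(k), accept with exp(-(k/lam)^2)."""
    out = []
    while len(out) < n:
        u = 1.0 - rng.random()                      # u in (0, 1]
        k = int((math.sqrt(1.0 + 8.0 / u) - 1.0) / 2.0)
        if rng.random() < math.exp(-(k / lam) ** 2):
            out.append(k)
    return out

rng = random.Random(3)
data = sample_normal_cutoff(20000, 20.0, rng)       # synthetic data, lambda = 20
grid = [0.5 * i for i in range(4, 201)]             # lambda from 2 to 100
print(fit_normal_cutoff(data, grid))
```

The exponential cutoff can be fitted in the same way by replacing $(k/\lambda)^2$ with $k/\lambda$ in both the normalization sum and the data term.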

Figure 5 summarizes results of the cutoff analysis. First
of all, it shows the previously mentioned fact that while $p$-values
obtained for the stationary cutoff-free distribution
increase fast with $t$, $p^*$-values of this fit are low and insensitive
to $t$. Fits with exponential cutoff perform better than
the original stationary distribution with respect to both $p$
and $p^*$ but both quantities gradually decrease with $t$ instead
of increasing (which is an unexpected behavior because
the fit is supposed to improve as the network grows).
Finally, the normal cutoff performs best and its $p$ and $p^*$
values do not decay with $t$. One may wonder how it is possible
that distributions with cutoff are able to achieve high
$p$ and $p^*$ values even when $t$ is very small and the core part
of the degree distribution, $1/[k(k+1)(k+2)]$, has not yet
had the time to form. The reason lies in very small cutoff
values inferred by likelihood estimation in those cases (see
the values shown in panels in the last column in Figure 5)
which results in the distribution shape being dominated
by the cutoff part instead of the previously mentioned core
part. We finally note that for normal cutoff, the cutoff parameter
values are proportional to $t^{0.5}$. This is the same
scaling as we found earlier for $\langle k_m\rangle$. This is understandable:
normal cutoff is sharp and its position is mainly
influenced by the highest degree values occurring in the
network.



3.5 Comparison with an analytical solution 

After the original submission of our manuscript, an analytical
work has been published where the Z-transform is used
to find the degree distribution as a function of time for a
growing network with an arbitrary initial condition [21].
When the network growth obeys preferential attachment,
their final result given in Eqs. (81) and (82) can be adapted
to our setting and yields the degree distribution of the new
nodes in the form

$$P(k,t) = \frac{n_0\mu_0 + 2t}{t}\left[\frac{c^k}{k} - \frac{2c^{k+1}}{k+1} + \frac{c^{k+2}}{k+2}\right] \qquad (15)$$



where $c = 1 - \sqrt{n_0\mu_0/(n_0\mu_0 + 2t)}$. (The first term of
Eq. (81) does not appear here because it describes the
contribution of the initial nodes. The normalization is
changed from $1/(n_0 + t)$ to $1/t$ because our $P(k,t)$ covers
$t$ new nodes instead of all $n_0 + t$ nodes as in [21].) This
result agrees well with our simulations.

When $t \to \infty$, $c \to 1$ and $P(k,t)$ reduces to Eq. (3) as it
has to. However, one can write $c = 1 - \sqrt{x^2/(1 + x^2)}$ where
$x := \sqrt{n_0\mu_0/(2t)}$ and consequently find an expansion of
$P(k,t)$ in powers of $x$. The leading order part of the result,

$$P(k,t) \approx \frac{4\left[1 - \tfrac{1}{6}(kx)^3 + \tfrac{1}{8}(kx)^4 + O\big((kx)^5\big)\right]}{k(k+1)(k+2)} \qquad (16)$$



contains the stationary solution and correction terms proportional
to $kx$ and its powers. While $x$ vanishes as $t \to \infty$,
the growing network allows us to inspect $P(k,t)$ at higher
values of $k$. Assuming that the stationary distribution
eventually establishes itself over the whole range of relevant
degrees, the expected largest degree is $\langle k_m\rangle \approx \sqrt{8t}$
(as shown in Section 3.2). This means that the correction
terms in $k_m x$ are independent of $t$ and thus do not vanish:
a deviation between the stationary distribution and the
"visible part" of $P(k,t)$ persists. The analytical form of
$P(k,t)$ given in Eq. (15) thus confirms the statistical tests
of model degree distributions reported above.
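
The time independence of the correction terms can be made explicit with a one-line calculation, using only the estimate $\langle k_m\rangle \approx \sqrt{8t}$ and the definition $x = \sqrt{n_0\mu_0/(2t)}$ given above (the numerical example for $n_0 = 30$, $\mu_0 = 29$ is ours):

```latex
\begin{equation}
\langle k_m\rangle\, x \approx \sqrt{8t}\,\sqrt{\frac{n_0\mu_0}{2t}} = 2\sqrt{n_0\mu_0},
\qquad\text{e.g.}\quad 2\sqrt{30\cdot 29} \approx 59 .
\end{equation}
```

The product depends only on the total degree of the initial network, so a larger or denser initial network implies a larger relative deviation at the tail, whatever the value of $t$.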



4 Conclusion 

The lack of attention to the importance of initial conditions
in network models is best illustrated by the thirteen
years separating the original publication of the preferential
attachment model [1] and the analytical result for the
model's degree distribution under arbitrary initial conditions
[21]. We studied the sensitivity of the Barabási-Albert
model of a growing network to the initial network
from which the growth starts. We found that the well-known
stationary distribution $f(k) = 4/[k(k+1)(k+2)]$
forms only when the initial nodes are few
and sparsely interconnected. We demonstrated
that as soon as the starting degree of the initial nodes $\mu_0$
exceeds 3, this small advantage allows them to attract an
excessive number of links in the future so that they never
merge with the stationary degree distribution of the nodes
that are introduced later in the network's evolution.

When focusing only on the newly added nodes and
their degree, we showed that their stationary degree distribution
is the same as that of the original model regardless
of the number of initial nodes $n_0$ and their degree
$\mu_0$. There are various ways to define the time needed
to approach this distribution. If we define the equilibration
time simply on the basis of the average degree of the
new nodes, it is proportional to $n_0^2$ in the case of the complete
initial network (and proportional to $n_0\mu_0$ in general)
which suggests rather fast equilibration. On the basis of
the standard goodness-of-fit test with the Kolmogorov-Smirnov
statistic, the equilibration time grows with $n_0$
much faster: the exponent is around 4.2 for the complete
initial network and 2.1 when $\mu_0$ is fixed.

However, no equilibration is found in two other cases
which are in fact closely related. In the first case, we
showed that when $n_0 > 10$, the average maximal degree of
the new nodes is and stays significantly smaller than the
value predicted from the stationary distribution. In the
second case, we showed that when the usual Kolmogorov-Smirnov
statistic is replaced by the weighted Kolmogorov-Smirnov
statistic which puts more weight on the tail of a
distribution, the hypothesis that the network degree distributions
are drawn from the stationary distribution of
the PA model is rejected for $n_0 > 10$ (for complete initial
networks). The reason for these two observations lies in
a distribution cutoff which shifts to higher degree values
as the network grows (thus the eventual convergence to
$f(k)$ in the functional form) but remains present and detectable
for any finite network size. One can conclude that
with respect to more sophisticated equilibration criteria,
degree distributions of the PA model equilibrate slowly
(with respect to the KS) or they do not equilibrate at all
(with respect to the WKS). These results are confirmed by
a recently published analytical form of the degree distribution
of the PA model for an arbitrary initial condition.
Fig. 6. Data size $n$ needed for the assumed power law $k^{-m}$ to
yield $p$ or $p^*$ less than 0.1 (the hypothesis is rejected) when
the input data follows $k^{-m}\exp[-(k/\lambda)^2]$. Results obtained
for $m = 3$ with $p$ and $p^*$ are shown with the thick solid and
dashed line, respectively. $p^*$ outperforms $p$ in blue-shaded regions
(shown for $m = 2.5, 3, 3.5, 4$). For small $n$ and $\lambda$, there
are also regions where $p$ outperforms $p^*$ (for clarity shown only
for $m = 3$ and marked with red stripes).



Note that models of network growth where aging of nodes
is considered [12,13,14] naturally depend less on the initial
network configuration. One can thus expect that these
models not only solve the problem of node degree strongly
biased by time (as is the case for PA) but also that of
the lack of equilibration. We also studied other common
network characteristics, clustering coefficient and assortativity,
and found that their overall behavior is not altered
by the presence of a non-trivial initial network. They both
vanish in the limit of $t \to \infty$, albeit at rates which depend
on $n_0$ and $\mu_0$.

We finally stress that there is a more general lesson
to be learned here. Despite the conventional wisdom [15],
the standard and weighted Kolmogorov-Smirnov statistics
may perform very differently on power-law data with
cutoff. When the cutoff is located at large values of a variable,
it may remain invisible to the standard Kolmogorov-Smirnov
statistic which then accepts data as being plausibly
generated by a given distribution. By contrast, the sensitivity
of the WKS statistic is distributed more evenly over
the range of possible values which improves its ability to
detect cutoffs and estimate their parameters (such as position
and shape, for example). This is demonstrated in
Figure 6 where the data size needed to reject the power-law
hypothesis for data generated by a power law with
normal cutoff is shown as a function of the cutoff position.
When the data is big enough, the $p^*$-value test can
"detect" higher cutoff values than the $p$-value test which
makes it a preferable choice in a wide range of parameters.
Regions where $p^*$ outperforms $p$ are smaller when
the actual cutoff has an exponential form. When the data
is small ($n < 1000$) or the power-law exponent is high
(four or more), it is still advisable to use the standard
Kolmogorov-Smirnov statistic.